31 research outputs found

An Approach to Answer: "How Tree-Like is a Network"

Author: Duck Geraint
Publication venue: University of Manchester
Publication date: 01/09/2010

The University of Manchester - Institutional Repository

Ambiguity and variability of database and software names in bioinformatics

Authors: Duck Geraint
Nenadic Goran
Robertson David
Stevens Robert
Publication date: 01/01/2012

The University of Manchester - Institutional Repository

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

Authors: Aerts Jan
Akune Yukie
Antezana Erick
Aoki-Kinoshita Kiyoko F
Arakawa Kazuharu
Aranda Bruno
Baran Joachim
Bolleman Jerven
Bonnal Raoul JP
Bono Hidemasa
Buttigieg Pier Luigi
Campbell Matthew P
Chen Yi-an
Chiba Hirokazu
Cock Peter JA
Cohen K Bretonnel
Constantin Alexandru
Duck Geraint
Dumontier Michel
Fujisawa Takatomo
Fujiwara Toyofumi
Goto Naohisa
Hoehndorf Robert
Igarashi Yoshinobu
Itaya Hidetoshi
Ito Maori
Iwasaki Wataru
Kalaš Matúš
Kano Yoshinobu
Katayama Toshiaki
Katoda Takeo
Kawamoto Shoko
Kawano Shin
Kawashima Shuichi
Kim Jin-Dong
Kim Taehong
Kocbek Simon
Kokubu Anna
Komiyama Yusuke
Kotera Masaaki
Laibe Camille
Lapp Hilmar
Lütteke Thomas
Marshall M Scott
Mori Hiroshi
Mori Takaaki
Morita Mizuki
Murakami Katsuhiko
Nakao Mitsuteru
Narimatsu Hisashi
Nishide Hiroyo
Nishimura Yosuke
Nystrom-Persson Johan
Ogishima Soichi
Okamoto Shinobu
Okamura Yasunobu
Okuda Shujiro
Ono Hiromasa
Oshita Kazuki
Packer Nicki H
Prins Pjotr
Ranzinger Rene
Rocca-Serra Philippe
Sansone Susanna
Sawaki Hiromichi
Shin Sung-Ho
Splendiani Andrea
Strozzi Francesco
Tadaka Shu
Takagi Toshihisa
Toukach Philip
Uchiyama Ikuo
Umezaki Masahito
Vos Rutger
Wang Yue
Whetzel Patricia L
Wilkinson Mark D
Wu Hongyan
Yamada Issaku
Yamaguchi Atsuko
Yamamoto Yasunori
Yamasaki Chisato
Yamashita Riu
York William S
Zmasek Christian M
Publication date: 01/01/2014

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

Maastricht University Research Portal

University of Bergen

Aberystwyth Research Portal

Springer - Publisher Connector

PubMed Central

Electronic Publication Information Center

NORA - Norwegian Open Research Archives

Macquarie University ResearchOnline

Access to Research at National University of Ireland, Galway

Extraction of database and software usage patterns from the bioinformatics literature

Author: Duck Geraint
Publication date: 01/08/2015

The University of Manchester - Institutional Repository

Software and database resource mentions across the whole of PubMed Central full-text articles

Author: Geraint Duck (679903)

This is a compressed .sql.gz file of a MySQL database dump. The table contains mentions of database and software resource names automatically extracted by bioNerDS across the full open-access subset of full-text PubMed Central articles. Each matched resource is identified by name, text offsets and "normalised" name, and also includes details of the rules from which the name was matched. This dataset is one of the primary research contributions of my PhD work, and a paper is currently being finalised for submission to PLoS Computational Biology.

FigShare

PubMed Central literature composition and analysis

Author: Geraint Duck (679903)

Compressed MySQL data dump of literature composition analyses including total token, sentence and syllable counts, Flesch readability scores, nouns, verbs and adjectives for the complete full-text open-access subset of PubMed Central. This is one of the research contributions of my PhD.

FigShare

Extracting patterns of database and software usage from the bioinformatics literature

Authors: Brass Andy
Duck Geraint
Nenadic Goran
Robertson David L.
Stevens Robert
Publication venue: Oxford University Press (OUP)
Publication date: 22/08/2014

Motivation: As a natural consequence of being a computer-based discipline, bioinformatics has a strong focus on database and software development, but the volume and variety of resources are growing at unprecedented rates. An audit of database and software usage patterns could help provide an overview of developments in bioinformatics and community common practice, and comparing the links between resources through time could demonstrate both the persistence of existing software and the emergence of new tools. Results: We study the connections between bioinformatics resources and construct networks of database and software usage patterns, based on resource co-occurrence, that correspond to snapshots of common practice in the bioinformatics community. We apply our approach to pairings of phylogenetics software reported in the literature and argue that these could provide a stepping stone into the identification of scientific best practice. Availability and implementation: The extracted resource data, the scripts used for network generation and the resulting networks are available at http://bionerds.sourceforge.net/networks/

PubMed Central

The University of Manchester - Institutional Repository

Enlighten

Ambiguity and variability of database and software names in bioinformatics

Authors: Duck Geraint
Kovacevic Aleksandar
Nenadic Goran
Robertson David L
Stevens Robert
Publication venue: Springer Science and Business Media LLC
Publication date: 29/06/2015

Background: There are numerous options available to achieve various tasks in bioinformatics, but until recently, there were no tools that could systematically identify mentions of databases and tools within the literature. In this paper we explore the variability and ambiguity of database and software name mentions and compare dictionary and machine learning approaches to their identification. Results: Through the development and analysis of a corpus of 60 full-text documents manually annotated at the mention level, we report high variability and ambiguity in database and software mentions. On a test set of 25 full-text documents, a baseline dictionary look-up achieved an F-score of 46 %, highlighting not only variability and ambiguity but also the extensive number of new resources introduced. A machine learning approach achieved an F-score of 63 % (with precision of 74 %) and 70 % (with precision of 83 %) for strict and lenient matching respectively. We characterise the issues with various mention types and propose potential ways of capturing additional database and software mentions in the literature. Conclusions: Our analyses show that identification of mentions of databases and tools is a challenging task that cannot be achieved by relying on current manually-curated resource repositories. Although machine learning shows improvement and promise (primarily in precision), more contextual information needs to be taken into account to achieve a good degree of accuracy.

Springer - Publisher Connector

PubMed Central

The University of Manchester - Institutional Repository

Enlighten